DOCUMENT RESUME

ED 129 754                                              SP 010 469

AUTHOR          Wagener, Elaine H.
TITLE           Use of Student-Composed Tests and Their Effect on the
                Attitudes and Task Performance of University
                Students.
PUB DATE        [76]
NOTE            12p.
EDRS PRICE      MF-$0.83 HC-$1.67 Plus Postage.
DESCRIPTORS     Achievement Rating; Cognitive Measurement; Cognitive
                Processes; *College Students; Evaluation; *Grading;
                Higher Education; Learning Processes; Seminars;
                Statistical Analysis; *Student Attitudes; *Student
                Seminars; Task Performance; *Test Construction;
                Tests



ABSTRACT 

The experiment described was designed to evaluate the
possible effects on students in a small seminar of an evaluation
system in which students were freed from some of the pressures of the
conventional grading system and allowed to participate in the grading
process, and to determine whether such participation would affect the
acquisition of course content and attitudes of students toward that
course. One of three randomly selected seminar groups was chosen to
be the experimental group in which the grading process was altered.
Pretesting was done covering course content to establish relative 
equality among the groups. An analysis of variance revealed no
statistically significant difference between the group that constructed
its own tests and assigned its own points for seminar participation
and the two control groups, which were graded by a more traditional
method. Chi Square analyses were performed on the attitude data,
revealing no significant differences among the groups in attitudes
toward unit quizzes, course grading, or small group seminars.
Although statistical analysis indicated no significant differences in 
attitudes in the groups, the study revealed that students felt the 
experience was valuable in learning to compose valid test questions, 
that it freed them from memorizing irrelevant details, released them 
from tension, and allowed a more receptive mindset for hearing and 
listening as well as allowing for increased teacher-student 
communication, rapport, and appreciation. (Author/JHF) 
USE OF STUDENT-COMPOSED TESTS AND THEIR EFFECT ON THE
ATTITUDES AND TASK PERFORMANCE OF UNIVERSITY STUDENTS



Grading practices have remained at the status quo level in spite
of many innovations in education. Teachers face many questions within
themselves as to the validity or appropriateness of current grading
procedures. Consequently, some new possibilities are being studied.
In a report by Collins and Nickle (1) five hundred forty-four institutions
of higher education were surveyed. Institutions replying seemed to be
experimenting with different grading systems, using traditional forms
within the student's major area and using non-traditional grading
systems outside the major area of study. Many research studies seem to
indicate that grading has an anxiety-producing effect on students or in
some way affects motivation, which in many cases negatively affects task
performance (2, 3, 4, 5). In a survey of university students, 65.8 percent
felt that grades interfered with learning because of undue pressure on
grades (6). This negative effect on the development of intrinsic and
permanent intellectual interests and the failure of grades to motivate
learning are emphasized in a report of evaluation practices at the
University of California (7).

Several studies reported no significant differences in either
attitude or performance when variations in grading practices were used:
criterion referenced grading (8), contracting (9), or indiscriminately
raising the grades of every student by one level in an experimental group
of high school Spanish students (10). Other studies revealed positive
results. Emerson (11) reported positive student attitudes toward a
social science course and the grading method when a system of points
was used in evaluating objective tests; textbook assignments; and
reviews of tapes, movies and articles. In a study by Good (12) of
one hundred forty-seven high school biology students there was no
significant difference at the .05 level in achievement, but self-graded
students were more productive and had a significantly different level
of aspiration. In response to this problem and with the desire to
implement a more productive approach to grading, the study described
in this paper was undertaken.

DESCRIPTION OF THE STUDY 

This experiment was designed to evaluate the possible effects on
students in a small seminar of an evaluation system in which students
were freed from some of the pressures of the conventional grading system
and allowed to participate in the grading process.

The students were primarily sophomores who were education majors
in a southern university. The course was a required survey education
course with an introduction to educational psychology, learning theory,
educational legislation, local school politics, teacher organizations and
unions, and current educational issues. The pretest and posttest, which
were given at one time to the fifteen seminar sections, consisted of fifty
multiple-choice items based on the content of the above topics. Information
concerning these topics was placed on cassette tapes in an independent
study lab. Bi-weekly quizzes, which were also objective multiple-choice
type questions, were composed jointly by the five seminar leaders and given
to all the students in the seminars. The bi-weekly quizzes covered
information given on assigned tapes. Seminars were designed to provide
discussion of questions which arose out of the tapes and to review questions
from quizzes of the previous week.



Three of the fifteen seminar groups were chosen to participate in
the experiment. The three groups were all conducted by the same seminar 
leader. The students were assigned to the groups randomly by computer. 
One of the three groups was assigned randomly to the experimental treatment. 

It was hypothesized that the experimental students' attitudes toward
the seminars and the testing process would be more positive if they
participated in the construction of the bi-weekly quizzes and graded
their own participation in the seminars. It was also hypothesized that
the scores of experimental subjects on a pretest and posttest which
evaluated knowledge of material available on cassette tapes would compare
favorably with the scores of subjects in the control groups. Acceptance
of this hypothesis would seem to indicate that the removal of grading
incentives did not adversely affect task performance.

The two control groups were subjected to a grading system in which 
tests were taken bi-weekly and two points were given for each test passed 
at a mastery level of eighty percent or above. The quizzes covered content
presented in tapes available in the independent study lab and assigned 
weekly. They were constructed by the staff of seminar leaders and the 
professor in charge of the project. Quizzes could be taken again for 
half credit if they were not passed satisfactorily the first time. 
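
The control-group scoring rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function and constant names are hypothetical, "half credit" is read as half of the two points, and the retake is assumed to require the same eighty percent mastery level, which the text does not state explicitly.

```python
MASTERY_THRESHOLD = 0.80          # "eighty percent or above" (from the text)
FULL_CREDIT = 2.0                 # two points for each quiz passed
RETAKE_CREDIT = FULL_CREDIT / 2   # assumption: "half credit" means one point


def quiz_points(first_score, retake_score=None):
    """Return the points earned on one bi-weekly quiz.

    Scores are fractions correct between 0 and 1. The retake is assumed
    to need the same mastery level as the first attempt.
    """
    if first_score >= MASTERY_THRESHOLD:
        return FULL_CREDIT
    if retake_score is not None and retake_score >= MASTERY_THRESHOLD:
        return RETAKE_CREDIT
    return 0.0


if __name__ == "__main__":
    # Hypothetical scores, not data from the study.
    print(quiz_points(0.86))        # passed on the first attempt
    print(quiz_points(0.72, 0.84))  # passed only on the retake
    print(quiz_points(0.60))        # not passed, no retake
```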

The experimental group listened to the tapes in the listening lab 
at whatever time they preferred. The first quiz was given in the same 
manner as in the control groups. However, the students were told that on 
the tapes that followed they would write their own questions which would 
be answered by all seminar members. It was explained that this was an 
attempt to free them from listening to the tapes "in order to answer 
questions on a quiz." They were told that this process would perhaps 
enable them to listen to assimilate information which might be of value
to them. Each student was to listen to all four or five tapes but
signed up to write two questions on one specific tape. Each student
was to use the questions on the first quiz as a model for his or her
questions. They were instructed to try not to ask trick questions but
to compose questions which were related to what they considered to be
the most important information on the tape.

Each question written by a student was to be answered by all other
students in the seminar. After the evaluation period, the questions were 
to be discussed both to clarify content and to consider problems involved 
in writing appropriate questions which would be clear and unambiguous. 

Each student who came prepared with his or her questions and 
participated in the evaluation and discussion was given the right to 
evaluate his own participation in the entire process and assign himself 
up to six points for the three quizzes. This part of the design was 
included to relieve students of the pressure of answering questions 
which might be unclear or poorly constructed and place the emphasis on 
the content rather than on grades. 

If a student was absent or came to class unprepared, he would 
automatically take the standard quiz used in the other seminars and be
graded on the usual basis. 

Students in all three seminars were given a 50-item multiple-choice
pretest based on questions similar to those used on the four quizzes.
They were given this same test again at the last seminar. A comprehensive
evaluation form was completed by all students taking this course,
and this evaluation was used as a basis for comparing the attitudes of
the experimental group with the control groups. In addition, each student
in the experimental group was asked to make comments on the experimental
process.

RESULTS AND DISCUSSION 

The Subjects/Groups X Tests analysis of variance indicates that
there is a significant pretest vs. posttest main effect (p < .01) as
might be predicted in most courses of study. However, the Between
Groups main effect and the Groups by Trials interaction were not
significant (Table 1), indicating that the student participation in the
preparation of the bi-weekly quizzes and self-evaluation of their
participation in weekly seminars affected their performance neither
negatively nor positively. These findings indicated that this process
was neither superior nor inferior to traditional grading methods.
Content was retained equally well by students who were studying for a
test and students who were not operating under the test-pressure syndrome.

TABLE 1
Ss/Gps X Tests ANOVA

Source of Variation       df        SS          MS          F

Between Ss                38      838.48
  Between Gps              2       30.33       15.16       < 1
  Ss/Gps                  36      808.15       20.72
Within Ss                 39     2736.50
  Overall Tests            1     2272.32     2272.32     196.28*
  Groups X Tests           2       47.41       23.70       2.04
  (Ss X Tests)/Gps        36      416.76       11.57

Total                     77     3574.98

*p < .01

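
As a rough check on how the F column of Table 1 is formed, the sketch below recomputes each ratio as a mean square (SS divided by df) over its error mean square. It assumes the usual error terms for this kind of design: the between-groups effect is tested against Ss/Gps, and the two within-subject effects against (Ss X Tests)/Gps. The dictionary and function names are illustrative only, and small differences from the printed values are to be expected from rounding.

```python
# Sums of squares and degrees of freedom as printed in Table 1.
TABLE_1 = {
    "Between Gps": (30.33, 2),
    "Ss/Gps": (808.15, 36),
    "Overall Tests": (2272.32, 1),
    "Groups X Tests": (47.41, 2),
    "(Ss X Tests)/Gps": (416.76, 36),
}


def mean_square(source):
    """Mean square for one source of variation: SS / df."""
    ss, df = TABLE_1[source]
    return ss / df


def f_ratio(effect, error):
    """F ratio of an effect against its assumed error term."""
    return mean_square(effect) / mean_square(error)


# Assumed pairings of effects and error terms (standard for a mixed design).
f_groups = f_ratio("Between Gps", "Ss/Gps")
f_tests = f_ratio("Overall Tests", "(Ss X Tests)/Gps")
f_interaction = f_ratio("Groups X Tests", "(Ss X Tests)/Gps")

if __name__ == "__main__":
    print(f"Between Gps    F = {f_groups:.2f}")
    print(f"Overall Tests  F = {f_tests:.2f}")
    print(f"Groups X Tests F = {f_interaction:.2f}")
```

Only the pretest vs. posttest ratio is large; the group and interaction ratios stay small, which is the pattern the text describes.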
The final attitude evaluation form contained eighteen items
measured on a five point Likert scale ranging from Unsatisfactory 
(1) through Adequate (3) to Very Satisfactory (5). Since this experiment 
was designed to measure difference in attitude toward grading, testing, 
and seminar sessions, these will be the only comparisons reported. 

It was hypothesized that the attitudes of the experimental group
would be significantly more positive than those of the control groups.
Chi Square analyses were performed to ascertain if the experimental
treatment did indeed affect the attitude of students concerning unit
quizzes, course grading, or small group seminars (Tables 2, 3, and 4).



TABLE 2
Unit Quizzes

                    1       2       3       4       5      Ss

Experimental       .5      .5     3.5     3.5     4.5      13
Group             .67       1    3.33       4       4

Control 1          .5     1.1     3.5       4       4      13
                  .67       1    3.33       4       4

Control 2          .5       1     3.5     4.5     3.5      13
                  .67       1    3.53       4       4

                    2       3      10      12      12

χ² = .6554, df = 8

TABLE 3
Course Grading

                    1       2       3       4       5      Ss

Experimental       .5      .5     1.5     3.5       7      13
Group             .33     .67    1.67    3.33       7

Control 1          .5      .5     1.5     3.5     7.5      13
                  .33     .67    1.67    3.33       7

Control 2          .5      .5     1.5     3.5     7.5      13
                  .33     .67    1.67    3.33       7

                    1       2       5      10      21

χ² = .3888

TABLE 4
Small Group Seminars

                    1       2       3       4       5      Ss

Experimental        0      .5     1.5     4.5     6.5      13
Group               0     .33    2.67       4       6

Control 1           0      .5     3.5     2.5     6.5      13
                    0     .33    2.67               6

Control 2           0      .5     2.5     4.5     4.5      13
                    0     .33    2.67       4       6

                    0       1       8      12      18

χ² = 2.2180

All χ² are not significant.

However, the χ² analyses indicated no significant differences at the .05
level between groups in their attitudes on these three items. (It
should be noted that the Yates correction for continuity was used
due to the small expected frequencies.) The hypothesis concerning
student attitudes was therefore rejected.
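
For readers unfamiliar with the correction mentioned above, the following sketch shows one common way a Chi Square statistic with a Yates continuity correction can be computed for a groups-by-ratings table. It is a generic illustration, not a reconstruction of the author's calculation: the function names are hypothetical, the counts are made up, and clamping the corrected difference at zero is an assumption that some treatments of the correction omit.

```python
def expected_counts(observed):
    """Expected cell counts under independence: row total * column total / N."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def chi_square(observed, yates=False):
    """Chi Square statistic, optionally with the Yates continuity correction."""
    expected = expected_counts(observed)
    stat = 0.0
    for obs_row, exp_row in zip(observed, expected):
        for o, e in zip(obs_row, exp_row):
            if e == 0:
                continue  # an empty rating column contributes nothing
            diff = abs(o - e)
            if yates:
                # Shrink each deviation by 0.5, never below zero (assumption).
                diff = max(diff - 0.5, 0.0)
            stat += diff ** 2 / e
    return stat


def degrees_of_freedom(observed):
    """(rows - 1) * (columns - 1)."""
    return (len(observed) - 1) * (len(observed[0]) - 1)


if __name__ == "__main__":
    # Hypothetical counts: three groups of 13 students, five rating levels.
    hypothetical = [
        [1, 2, 4, 3, 3],
        [2, 1, 3, 4, 3],
        [1, 3, 3, 3, 3],
    ]
    print("uncorrected:", round(chi_square(hypothetical), 4))
    print("with Yates: ", round(chi_square(hypothetical, yates=True), 4))
    print("df:", degrees_of_freedom(hypothetical))
```

The correction always lowers the statistic, which makes the test more conservative when expected frequencies are small.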

CONCLUSIONS 

Although statistical analysis indicated no significant differences
in attitudes of the three groups, student comments concerning the
experimental procedure revealed their perceptions about the advantages
of this method. They felt it was valuable in learning to compose valid
test questions, freed them from memorizing irrelevant details, released
them from tension and allowed a more receptive mindset for hearing the
tapes. Another benefit received was increased teacher-student
communication, rapport and appreciation.

Long before John Holt wrote How Children Fail, the stance regarding
grading as a motivational device had been questioned. Teacher training
institutions have been plagued with teaching one philosophy with regard
to the grading process, particularly in elementary education, and have
continued to incorporate procedures which contradict that instruction.
The idea that without grades students would fail to learn or would cease
expending maximum effort and time in preparation has been an overriding
factor in the decision to retain grading.

This study would seem to indicate that students learn as much when
emphasis is placed on content rather than grades, even though their
attitudes toward the learning process are not significantly affected by
participation in the grading process. Perhaps John Holt's position
would bear a more careful consideration.
"...any tests that are not a personal matter between the
learner and someone helping him learn but were given instead
to grade and label for someone else's purposes ... are
illegitimate and harmful" (9).